Integration of Semistructured Data Using Outer Joins

Author

  • Koichi Munakata
Abstract

When we integrate information sources that are managed locally, the integrated objects are often semistructured, since different sets of information sources participate in forming each of the objects. Such integration requires outer join and conversion operators: outer join operators prevent information loss that would be caused by inner joins, and conversion operators are used to mediate the inconsistency among information sources. These operators impose restrictions on the order of query processing. This paper presents an algorithm for creating a processing plan to integrate information sources without information loss, in the presence of conversion operators. As a framework of our discussion, we use the TSIMMIS mediation system developed at Stanford University. Given a view specification and a query, we create an initial expression graph whose nodes represent query processing units, including outer join and conversion operators. Then, we convert the initial expression graph into an executable expression tree.
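The role of the two operator kinds named in the abstract can be illustrated with a small sketch. This is not the paper's algorithm and does not use TSIMMIS; the sources `staff` and `phones`, the conversion function `normalize_name`, and the join helpers are all hypothetical names invented for illustration. It only shows why an inner join loses objects, why a full outer join yields semistructured results, and why the conversion has to run before the join.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def normalize_name(name: str) -> str:
    """Hypothetical conversion operator: 'Smith, Ann' -> 'ann smith'."""
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return " ".join(name.lower().split())


def index_by(records: List[Record], key: str,
             convert: Callable[[str], str]) -> Dict[str, Record]:
    """Apply the conversion to the join attribute, then index on it."""
    return {convert(r[key]): r for r in records}


def inner_join(left: Dict[str, Record],
               right: Dict[str, Record]) -> Dict[str, Record]:
    """Keep only keys present in both sources."""
    return {k: {**left[k], **right[k]} for k in left if k in right}


def full_outer_join(left: Dict[str, Record],
                    right: Dict[str, Record]) -> Dict[str, Record]:
    """Keep every key; attributes missing from one source are simply absent."""
    keys = list(left) + [k for k in right if k not in left]
    return {k: {**left.get(k, {}), **right.get(k, {})} for k in keys}


# Two made-up, locally managed sources with inconsistent name formats.
staff = [{"name": "Smith, Ann", "office": "B12"},
         {"name": "Lee, Bo", "office": "C03"}]
phones = [{"name": "ann smith", "phone": "555-0100"},
          {"name": "cy park", "phone": "555-0199"}]

left = index_by(staff, "name", normalize_name)
right = index_by(phones, "name", normalize_name)

inner = inner_join(left, right)       # drops 'bo lee' and 'cy park'
outer = full_outer_join(left, right)  # keeps all three, with differing fields

# Joining before converting matches nothing: an ordering constraint.
unconverted = inner_join(index_by(staff, "name", str),
                         index_by(phones, "name", str))
```

In this toy data the outer-join result holds three objects with different attribute sets, which is the sense in which the integrated objects are semistructured.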

Similar resources

Choosing the most suitable query language for using outer joins to extract data in Datalog mode in the DES deductive database system

Deductive database systems are designed based on a logical data model. In a deductive database system, data are saved as facts, as opposed to a relational database management system (RDBMS), in which data are stored in tables. Datalog Educational System (DES) is a deductive database system in which Datalog mode is the default mode. It can extract data using outer joins with three query la...

Efficient Skew Handling for Outer Joins in a Cloud Computing Environment

Outer joins are ubiquitous in many workloads and Big Data systems. The question of how to best execute outer joins in large parallel systems is particularly challenging, as real world datasets are characterized by data skew leading to performance issues. Although skew handling techniques have been extensively studied for inner joins, there is little published work solving the corresponding prob...

Outer Joins and Filters for Instantiating Objects from Relational Databases Through Views

One of the approaches for integrating object-oriented programs with databases is to instantiate objects from relational databases by evaluating view queries. In that approach, it is often necessary to evaluate some joins of the query by left outer joins to prevent information loss caused by the tuples discarded by inner joins. It is also necessary to filter some relations with selection condition...

Efficient Large Outer Joins over MapReduce

Big Data analytics largely rely on being able to execute large joins efficiently. Though inner join approaches have been extensively evaluated in parallel and distributed systems, there is little published work providing analysis of outer joins, especially on the extremely popular MapReduce platform. In this paper, we studied several current algorithms/techniques used in large outer joins. We f...

Efficient Outer Join Data Skew Handling in Parallel DBMS

Large enterprises have been relying on parallel database management systems (PDBMS) to process their ever-increasing data volume and complex queries. The scalability and performance of a PDBMS come from load balancing on all nodes in the system. Skewed processing will significantly slow down query response time and degrade the overall system performance. Business intelligence tools used by ent...

Publication date: 1997